In this article, we apply transfer function bounding techniques to obtain upper
bounds on the bit-error rate for maximum-likelihood decoding of turbo codes constructed
with random permutations. These techniques are applied to two turbo
codes with constraint length 3 and later extended to other codes. The performance
predicted by these bounds is compared with simulation results. The bounds are
useful in estimating the “error floor” that is difficult to measure by simulation, and
they provide insight on how to lower this floor. More refined bounds are needed for
accurate performance measures at lower signal-to-noise ratios.

I. Introduction 

Simulations have shown that turbo codes can produce low error rates at astonishingly low signal-to-
noise ratios if the information block is large and the permutations are selected randomly [3,4]. In addition
to simulations, it is also useful to have theoretical bounds that establish decoder performance in the range 
where obtaining sufficient data from simulations is impractical. 

In this article, we apply transfer function bounding techniques to obtain upper bounds on the bit-error 
rate for maximum-likelihood decoding of turbo codes constructed with random permutations. The premise 
for these bounds is the same as for the usual transfer function bounds applied to standard convolutional 
codes [7]. The error probability is upper bounded by a union bound that sums contributions from error
paths of different encoded weights. The state diagram of the code is used to enumerate the paths of each 
possible weight. 

The transfer function bounds for turbo codes differ from the usual transfer function bounds for
convolutional codes in several respects. First, for turbo codes, these bounds require a term-by-term
joint enumerator for all possible combinations of input weights and output weights of error events,
even for bounds on the word error rate. Second, because turbo codes are block codes, it is crucial to
accurately enumerate “compound” error events that can include more than one excursion from the
all-zero state during the fixed block length, as shown in Fig. 1. Third, since explicit results are
intractable for any particular randomly chosen permutation, the bound is developed as a random coding
bound. Finally, the bounds are derived as upper bounds on the bit-error rate of a maximum-likelihood
decoder operating on turbo-encoded data, whereas the iterative turbo-decoding procedure is not
guaranteed to converge to the maximum-likelihood codeword. On the other hand, since the turbo decoder
attempts to minimize the bit-error rate rather than the word-error rate, it can in some circumstances
yield a slightly lower bit-error rate than a maximum-likelihood decoder.

Transfer function bounds for turbo codes were first published by Benedetto and Montorsi [1,2], but our 
computation method allows for more accurate numerical results. Our algorithm uses a short recursion 
formula to calculate the necessary transfer function coefficients efficiently for large block lengths. We 
have found that the bound “diverges” (i.e., becomes useless) at low Eb/No, as does the corresponding 
type of bound for standard convolutional codes. 

Our approach in this article is first to explicitly develop the bounds for two exemplary turbo codes 
with constraint length 3. One code is the same as that reported in [2]. The second code should be superior to
the first according to the heuristic arguments of [5]; it is presented to underscore the reasons why one 
code should outperform the other. Only after developing the two examples in detail do we extend the 
theory to additional codes. 


II. Turbo Code Examples 

In this section, we introduce the two particular turbo codes that will be used throughout this article 
to develop the bounds. 

A. Encoder Diagrams 

Figures 2(a) and 2(b) depict the two exemplary turbo encoders. Both encoders produce one uncoded
output stream $x_0$ and two encoded parity streams $x_1$, $x_2$, for an overall code rate of 1/3. The parity
streams come from simple recursive convolutional encoders with constraint length $K = 3$ (i.e., memory
$m = 2$). For the first code, the parity sequences both correspond to a ratio of generator polynomials
$g_a/g_b$, where $g_a(D) = 1 + D + D^2$ and $g_b(D) = 1 + D^2$. For the second code, the parity sequences
both correspond to $g_b/g_a$. Representing $g_a$ as octal 7 and $g_b$ as octal 5, the two codes are denoted by
(1, 7/5, 7/5) and (1, 5/7, 5/7), respectively. The notation explicitly shows the method of generating
each of the three output streams, one uncoded and two parity. We refer to these three separate rate-1
components of the code as code fragments. The turbo code is a parallel concatenation of its code
fragments. By parallel concatenation, we simply mean adjoining several pieces of a codeword to form the
full codeword.

Figures 2(a) and 2(b) show each code fragment preceded by a permutation, $\pi_0$, $\pi_1$, or $\pi_2$. This
seemingly needless complication is introduced here for symmetry and to facilitate random coding arguments
presented later. For these two examples, the only permutation relevant to the construction of the overall
turbo code is the relative permutation $\pi_1^{-1}\pi_2$ or $\pi_2^{-1}\pi_1$ between the inputs $u_1$ and $u_2$. In practice, the
permutations $\pi_0$ and $\pi_1$ are identities (i.e., no permutation). Each of the encoders in Figs. 2(a) and 2(b)
is used to generate a $(3(N + 2), N)$ block code, where N is the information block length. Following the
information bits, an additional 2 “tail bits” are appended in order to drive the encoder to the all-zero
state at the end of the block. The termination method described in [3] can be used.

Fig. 2. Two examples of turbo encoders: (a) the (1, 7/5, 7/5) code and (b) the (1, 5/7, 5/7) code.


B. State Diagrams 

Each of the two exemplary turbo encoders is a four-state device. Figures 3(a) and 3(b) show the state
transition diagrams for the nontrivial code fragments, $(g_a/g_b)$ and $(g_b/g_a)$, respectively. In this diagram,
each transition between states is labeled by the input information bit and the corresponding output
encoded bit. It is convenient to replace each edge label in Figs. 3(a) and 3(b) with a monomial $L^l I^i D^d$,
where $l$ is always equal to 1, and $i$ and $d$ are either 0 or 1, depending on whether the corresponding input
and output bits are 0 or 1, respectively. Then the information in the state transition diagrams can be
summarized by state transition matrices $A(L, I, D)$, where

$$
A_{7/5}(L, I, D) =
\begin{pmatrix}
L & LID & 0 & 0 \\
0 & 0 & LD & LI \\
LID & L & 0 & 0 \\
0 & 0 & LI & LD
\end{pmatrix}
\tag{1}
$$

for the (7/5) code fragment, and

$$
A_{5/7}(L, I, D) =
\begin{pmatrix}
L & LID & 0 & 0 \\
0 & 0 & LI & LD \\
LID & L & 0 & 0 \\
0 & 0 & LD & LI
\end{pmatrix}
\tag{2}
$$

for the (5/7) code fragment.
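
As a cross-check on Eqs. (1) and (2), the edge labels can be generated directly from the generator polynomials. The following Python sketch is illustrative only: the function names, the tap-tuple representation, and the state ordering (index s1 + 2*s2, with s1 the most recent register bit) are assumptions made here so that rows and columns line up with the matrices above; they are not taken from the article.

```python
def edge_labels(feedforward, feedback):
    """Map (from_state, to_state) -> (input_bit, parity_bit) for a memory-2
    recursive encoder. Taps are the coefficients of (1, D, D^2); the constant
    terms are assumed to be 1. State index = s1 + 2*s2 (assumed ordering)."""
    _, g1, g2 = feedforward
    _, f1, f2 = feedback
    labels = {}
    for s2 in (0, 1):
        for s1 in (0, 1):
            for u in (0, 1):
                a = u ^ (f1 & s1) ^ (f2 & s2)        # bit shifted into the register
                parity = a ^ (g1 & s1) ^ (g2 & s2)   # feedforward (parity) output
                labels[(s1 + 2 * s2, a + 2 * s1)] = (u, parity)
    return labels


def as_matrix(labels):
    """Render the labels as a 4x4 table of monomials such as 'LID'."""
    rows = []
    for r in range(4):
        row = []
        for c in range(4):
            if (r, c) in labels:
                u, parity = labels[(r, c)]
                row.append("L" + "I" * u + "D" * parity)
            else:
                row.append("0")
        rows.append(row)
    return rows


G_A, G_B = (1, 1, 1), (1, 0, 1)  # g_a = octal 7, g_b = octal 5
A_75 = as_matrix(edge_labels(feedforward=G_A, feedback=G_B))
A_57 = as_matrix(edge_labels(feedforward=G_B, feedback=G_A))
for row in A_75:
    print(row)
```

Under these assumptions the two tables differ only in rows 2 and 4, where the LI and LD entries trade places.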

C. Input-Output Weight Enumerator 

For a given code fragment, defined by a state diagram with $2^m$ states as in Section II.B, denote by
$t(l, i, d)$ the number of paths of length $l$, input weight $i$, and output weight $d$, starting and ending in state
$0^m$, with the proviso that the last $m$ edges in the path are “termination” edges and have neither length
nor input weight. The corresponding transfer function (generating function) is defined by

$$
T(L, I, D) = \sum_{l \ge 0} \sum_{i \ge 0} \sum_{d \ge 0} L^l I^i D^d \, t(l, i, d)
\tag{3}
$$

Fig. 3. State diagrams of two code fragments: (a) the (7/5) code fragment and
(b) the (5/7) code fragment.


Using the method described in Section 4.7 of [8], we find that $T(L, I, D)$ is the $(0^m, 0^m)$ entry in the
matrix

$$
\left( I + A(L, I, D) + A(L, I, D)^2 + A(L, I, D)^3 + \cdots \right) A(1, 1, D)^m
\tag{4}
$$

The factor $A(1, 1, D)^m$ takes care of the termination edges. Since $I + A + A^2 + A^3 + \cdots = (I - A)^{-1}$, it
follows from Eq. (4) that

$$
T(L, I, D) = \left[ (I - A(L, I, D))^{-1} \cdot A(1, 1, D)^m \right]_{(0^m, 0^m)}
\tag{5}
$$


Using Eq. (5) (approximately, by omitting the termination factor $A(1, 1, D)^m$), we find that the transfer
functions for the (7/5) and (5/7) code fragments are

$$
T_{7/5}(L, I, D) = \frac{1 - LD - L^2 D + L^3 (D^2 - I^2)}
{1 - L(1 + D) + L^3 (D + D^2 - I^2 - I^2 D^3) - L^4 (D^2 - I^2 - I^2 D^4 + I^4 D^2)}
\tag{6}
$$

and

$$
T_{5/7}(L, I, D) = \frac{1 - LI - L^2 I - L^3 (D^2 - I^2)}
{1 - L(1 + I) - L^3 (D^2 - I - I^2 + I^3 D^2) + L^4 (D^2 - I^2 - I^2 D^4 + I^4 D^2)}
\tag{7}
$$

respectively. Note that, in this approximation, $T_{5/7}(L, I, D) = T_{7/5}(L, D, I)$, i.e., the roles of input weight
$i$ and output weight $d$ are reversed for the two code fragments.

If we multiply both sides of Eq. (6) by the denominator of the right-hand side, and take the coefficient
of $L^l I^i D^d$ of both sides of the resulting equation, we obtain the following recursion determining $t(l, i, d)$
for the (7/5) code fragment, for $l \ge 0$, $i \ge 0$, $d \ge 0$:

$$
\begin{aligned}
t(l, i, d) = {} & t(l-1, i, d-1) + t(l-1, i, d) \\
& + t(l-3, i-2, d-3) + t(l-3, i-2, d) - t(l-3, i, d-2) - t(l-3, i, d-1) \\
& + t(l-4, i-4, d-2) - t(l-4, i-2, d-4) - t(l-4, i-2, d) + t(l-4, i, d-2) \\
& + \delta(l, i, d) - \delta(l-1, i, d-1) - \delta(l-2, i, d-1) + \delta(l-3, i, d-2) - \delta(l-3, i-2, d)
\end{aligned}
$$

where $\delta(l, i, d) = 1$ if $l = i = d = 0$ and $\delta(l, i, d) = 0$ otherwise, and with the initial conditions that
$t(l, i, d) = 0$ if any index is negative.

Similarly, $t_{5/7}(l, i, d)$ for the (5/7) code fragment can be evaluated by the recursion

$$
\begin{aligned}
t(l, i, d) = {} & t(l-1, i-1, d) + t(l-1, i, d) \\
& + t(l-3, i-3, d-2) - t(l-3, i-2, d) - t(l-3, i-1, d) + t(l-3, i, d-2) \\
& - t(l-4, i-4, d-2) + t(l-4, i-2, d-4) + t(l-4, i-2, d) - t(l-4, i, d-2) \\
& + \delta(l, i, d) - \delta(l-1, i-1, d) - \delta(l-2, i-1, d) - \delta(l-3, i, d-2) + \delta(l-3, i-2, d)
\end{aligned}
$$

again with the understanding that $t(l, i, d) = 0$ if any index is negative.
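
The recursion for the (7/5) code fragment can be checked against a direct count of trellis paths. The Python sketch below is a minimal illustration under the same approximation as Eq. (6) (termination edges omitted); the names `t_rec` and `t_brute`, the term tables, and the encoder state convention are assumptions made for this example, not the authors' implementation.

```python
from collections import Counter
from functools import lru_cache

# Each tuple (sign, dl, di, dd) contributes sign * t(l - dl, i - di, d - dd).
T_TERMS = [
    (+1, 1, 0, 1), (+1, 1, 0, 0),
    (+1, 3, 2, 3), (+1, 3, 2, 0), (-1, 3, 0, 2), (-1, 3, 0, 1),
    (+1, 4, 4, 2), (-1, 4, 2, 4), (-1, 4, 2, 0), (+1, 4, 0, 2),
]
# Each tuple (sign, dl, di, dd) contributes sign * delta(l - dl, i - di, d - dd).
DELTA_TERMS = [
    (+1, 0, 0, 0), (-1, 1, 0, 1), (-1, 2, 0, 1), (+1, 3, 0, 2), (-1, 3, 2, 0),
]


@lru_cache(maxsize=None)
def t_rec(l, i, d):
    """Evaluate t(l, i, d) for the (7/5) fragment by the recursion."""
    if l < 0 or i < 0 or d < 0:
        return 0
    total = sum(s * t_rec(l - dl, i - di, d - dd) for s, dl, di, dd in T_TERMS)
    total += sum(s for s, dl, di, dd in DELTA_TERMS
                 if (l - dl, i - di, d - dd) == (0, 0, 0))
    return total


def t_brute(length):
    """Count zero-state to zero-state paths of the (7/5) fragment directly.
    Returns {(i, d): number of paths} for the given path length."""
    paths = Counter({(0, 0, 0, 0): 1})  # (s1, s2, i, d) -> count
    for _ in range(length):
        nxt = Counter()
        for (s1, s2, i, d), count in paths.items():
            for u in (0, 1):
                a = u ^ s2            # feedback polynomial 1 + D^2
                parity = a ^ s1 ^ s2  # feedforward polynomial 1 + D + D^2
                nxt[(a, s1, i + u, d + parity)] += count
        paths = nxt
    return {(i, d): c for (s1, s2, i, d), c in paths.items() if (s1, s2) == (0, 0)}


# Compare the two methods on short paths.
for length in range(9):
    brute = t_brute(length)
    for i in range(length + 1):
        for d in range(length + 1):
            assert t_rec(length, i, d) == brute.get((i, d), 0)
```

The recursion touches only a fixed number of earlier coefficients per entry, which is what makes it practical for large block lengths, whereas the brute-force count is included here only as a reference.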


III. Union Bounds on Word and Bit-Error Probabilities 

In this section, we use the input-output weight enumerators t(l,i,d) for the various code fragments
to obtain a union bound on the probabilities of word error and bit error, assuming an additive white
Gaussian noise channel with channel symbol signal-to-noise ratio Es/No.

A. Derivation of the Bound 

As depicted in Figs. 2(a) and 2(b), the turbo code is constructed as a parallel concatenation of its three
code fragments, each preceded by a random permutation of the input information bits u. The randomly
chosen permutations $\pi_0$, $\pi_1$, and $\pi_2$ transform the input sequence u into three permuted sequences $u_0$,
$u_1$, and $u_2$, each having the same Hamming weight as the original sequence u. Since the turbo code has
block length N, there are $t_{7/5}(N, i, d)$ codeword fragments of input weight i and output weight d from
each of the two (7/5) code fragments of the (1, 7/5, 7/5) code, and $t_{5/7}(N, i, d)$ codeword fragments of input
weight i and output weight d from each of the two (5/7) code fragments of the (1, 5/7, 5/7) code.

Denote by $p(d|i)$ the conditional probability of producing a codeword fragment of weight d given a
randomly selected input sequence of weight i. Then$^1$

$$
p(d|i) = \frac{t(N, i, d)}{\sum_{d'} t(N, i, d')} = \frac{t(N, i, d)}{\binom{N}{i}}
$$

$^1$ The summation $\sum_{d'} t(N, i, d')$ equals the total number of codewords of information weight i, $\binom{N}{i}$. However, this is not
exact if t(N,i,d) is computed according to the approximation that does not account for the termination edges.

The conditional probability distributions $p_{7/5}(d|i)$ and $p_{5/7}(d|i)$ are plotted in Figs. 4(a) and 4(b) for the
two code fragments (7/5) and (5/7), for block length N = 100. For the uncoded fragment, $p_1(d|i) = \delta(i, d)$,
i.e., the output weight always equals the input weight.

Note from Figs. 4(a) and 4(b) that the (7/5) code fragment admits only even input weights i, while 
the (5/7) code fragment has only even output weights d. The vertical scale of Fig. 4(b) is twice that of 
Fig. 4(a) to reflect the concentration of probability into even-only output weights for the (5/7) fragment. 
The figures also show for reference a binomial probability distribution for 100 trials with probability 1/2
(in the case of Fig. 4(b), the reference “binomial” distribution is twice the binomial probability for even 
weights only). 
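
The parity properties noted above can be checked by direct enumeration. The Python sketch below is illustrative only: it uses the approximation that omits termination edges, a block length of N = 20 (rather than the N = 100 of the figures) to keep the run short, and it normalizes by the sum over d' rather than by the binomial coefficient. The function names and tap-tuple representation are assumptions of this example.

```python
from collections import Counter

G_A, G_B = (1, 1, 1), (1, 0, 1)  # g_a = 1 + D + D^2 (octal 7), g_b = 1 + D^2 (octal 5)


def zero_state_enumerator(n, feedforward, feedback):
    """Return {(i, d): t(n, i, d)} for paths of length n that start and end in
    the all-zero state (termination edges omitted)."""
    _, g1, g2 = feedforward
    _, f1, f2 = feedback
    paths = Counter({(0, 0, 0, 0): 1})  # (s1, s2, i, d) -> count
    for _ in range(n):
        nxt = Counter()
        for (s1, s2, i, d), count in paths.items():
            for u in (0, 1):
                a = u ^ (f1 & s1) ^ (f2 & s2)
                parity = a ^ (g1 & s1) ^ (g2 & s2)
                nxt[(a, s1, i + u, d + parity)] += count
        paths = nxt
    return {(i, d): c for (s1, s2, i, d), c in paths.items() if (s1, s2) == (0, 0)}


def conditional_distribution(t, i):
    """p(d|i) obtained by normalizing t(N, i, d) over d for a fixed input weight i."""
    row = {d: c for (ii, d), c in t.items() if ii == i}
    total = sum(row.values())
    if total == 0:
        return {}
    return {d: c / total for d, c in sorted(row.items())}


t75 = zero_state_enumerator(20, feedforward=G_A, feedback=G_B)  # (7/5) fragment
t57 = zero_state_enumerator(20, feedforward=G_B, feedback=G_A)  # (5/7) fragment

print("(7/5) input weights :", sorted({i for i, _ in t75}))
print("(5/7) output weights:", sorted({d for _, d in t57}))
print("(7/5) p(d|i=2):", conditional_distribution(t75, 2))
print("(5/7) p(d|i=2):", conditional_distribution(t57, 2))
```

Comparing the two printed distributions for i = 2 gives a small-N view of the skew toward low output weights discussed in the text.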

In both Figs. 4(a) and 4(b), the conditional probability distribution $p(d|i)$ approaches the binomial
reference for moderate to large values of input weight i, indicating a more or less random distribution of
output weights d. In contrast, the skewed distributions for low values of i are what differentiate the two
code fragments from each other and the corresponding overall turbo codes from purely random codes. In
particular, notice how the $p(d|i)$ distribution for input weight i = 2 is more skewed toward lower output
weights d for the (7/5) fragment than for the (5/7) fragment.

If the permutations are selected randomly and independently, the probability $p(d_0, d_1, d_2|i)$ that any
input sequence u of weight i will be mapped into codeword fragments of weights $d_0$, $d_1$, and $d_2$ is

$$
p(d_0, d_1, d_2|i) = p_1(d_0|i) \, p_{7/5}(d_1|i) \, p_{7/5}(d_2|i)
$$

for the (1, 7/5, 7/5) code and

$$
p(d_0, d_1, d_2|i) = p_1(d_0|i) \, p_{5/7}(d_1|i) \, p_{5/7}(d_2|i)
$$

for the (1, 5/7, 5/7) code. The conditional probability that a maximum-likelihood decoder will prefer a
particular codeword of total weight $d = d_0 + d_1 + d_2$ to the all-zero codeword is $Q(\sqrt{2 d E_s/N_0})$, where $Q(\cdot)$
is the complementary unit variance Gaussian distribution function. Thus, the codeword error probability
$P_w$ is upper bounded as follows:

$$
P_w = \sum_{i=1}^{N} \mathrm{Prob}(\text{error event of weight } i)
\le \sum_{i=1}^{N} \binom{N}{i} E_{d|i} \left\{ Q\left( \sqrt{2 d E_s/N_0} \right) \right\}
\tag{8}
$$

where the conditional expectation $E_{d|i}\{\cdot\}$ is over the probability distribution $p(d_0, d_1, d_2|i)$. Similarly,
the information bit-error probability $P_b$ is upper bounded by

$$
P_b = \sum_{i=1}^{N} \frac{i}{N} \, \mathrm{Prob}(\text{error event of weight } i)
\le \sum_{i=1}^{N} \frac{i}{N} \binom{N}{i} E_{d|i} \left\{ Q\left( \sqrt{2 d E_s/N_0} \right) \right\}
\tag{9}
$$

The error probabilities $P_w$ and $P_b$ bounded in Eqs. (8) and (9) are averages (over randomly chosen
permutations) of the word and bit error probabilities achieved by any particular turbo code with specified
permutations $\pi_0$, $\pi_1$, $\pi_2$.
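
To make the averaging concrete, the following Python sketch evaluates a bit-error union bound for a small block length. It is a rough illustration under several assumptions that are not taken from the article: the bound of Eq. (9) is taken to be the sum over i of (i/N) C(N,i) E{Q(sqrt(2 d Es/No))}, the fragment enumerators omit termination edges, p(d|i) is t(N,i,d)/C(N,i), and Es/No is set to Eb/No divided by 3 (ignoring the rate loss of the tail bits). All function names are hypothetical.

```python
import math
from collections import Counter


def q_function(x):
    """Tail probability of a zero-mean, unit-variance Gaussian."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def fragment_enumerator(n, feedforward, feedback):
    """Return {i: {d: t(n, i, d)}} for zero-state to zero-state paths
    (termination edges omitted)."""
    _, g1, g2 = feedforward
    _, f1, f2 = feedback
    paths = Counter({(0, 0, 0, 0): 1})  # (s1, s2, i, d) -> count
    for _ in range(n):
        nxt = Counter()
        for (s1, s2, i, d), count in paths.items():
            for u in (0, 1):
                a = u ^ (f1 & s1) ^ (f2 & s2)
                parity = a ^ (g1 & s1) ^ (g2 & s2)
                nxt[(a, s1, i + u, d + parity)] += count
        paths = nxt
    table = {}
    for (s1, s2, i, d), count in paths.items():
        if (s1, s2) == (0, 0):
            table.setdefault(i, {})[d] = count
    return table


def bit_error_bound(n, ebno_db, feedforward, feedback, rate=1.0 / 3.0):
    """Union bound on the bit-error rate for two identical parity fragments
    plus the uncoded fragment, averaged over random permutations."""
    esno = rate * 10.0 ** (ebno_db / 10.0)
    table = fragment_enumerator(n, feedforward, feedback)
    bound = 0.0
    for i in range(1, n + 1):
        row = table.get(i, {})
        n_inputs = math.comb(n, i)
        for d1, c1 in row.items():
            for d2, c2 in row.items():
                # C(N,i) * p(d1|i) * p(d2|i) with p = t / C(N,i); d0 = i (uncoded).
                weight = c1 * c2 / n_inputs
                total_d = i + d1 + d2
                bound += (i / n) * weight * q_function(math.sqrt(2.0 * total_d * esno))
    return bound


G_A, G_B = (1, 1, 1), (1, 0, 1)
for ebno_db in (2.0, 3.0, 4.0):
    b75 = bit_error_bound(20, ebno_db, feedforward=G_A, feedback=G_B)
    b57 = bit_error_bound(20, ebno_db, feedforward=G_B, feedback=G_A)
    print(f"Eb/No = {ebno_db:.1f} dB: (1,7/5,7/5) {b75:.3e}   (1,5/7,5/7) {b57:.3e}")
```

For block lengths near 1000 the binomial coefficients and enumerator values become very large, so a direct floating-point evaluation like this one should not be expected to remain accurate without extra care.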

B. Evaluation of the Bound for the Examples 

Fig. 4. Conditional probability distribution $p(d|i)$ for output weight d given input weight i:
(a) code fragment (7/5) and (b) code fragment (5/7).

Figures 5(a) and 5(b) show the bounds on $P_b$ for the (1, 7/5, 7/5) code and the (1, 5/7, 5/7) code
for various block lengths N. Note that the transition from a well-behaved, useful, low $P_b$ bound into
a diverged, useless bound greater than 1 occurs very abruptly if the block length N (i.e., permutation
length) is large. The abrupt transition occurs roughly when the information bit signal-to-noise ratio Eb/No
drops below the threshold determined by the computational cutoff rate Ro, i.e., when
$E_s/N_0 = r E_b/N_0 < -\ln(2^{1-r} - 1)$ for a code with rate r [9]. This behavior mimics that of similar bounds applied to totally
random codes, which turbo codes resemble.
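
The cutoff rate threshold follows directly from the expression above. As a minimal sketch (the function name is an assumption of this example), the threshold in dB for a given code rate can be computed as:

```python
import math


def cutoff_rate_threshold_db(rate):
    """Eb/No in dB at which Es/No = rate * Eb/No equals -ln(2^(1 - rate) - 1)."""
    esno = -math.log(2.0 ** (1.0 - rate) - 1.0)
    return 10.0 * math.log10(esno / rate)


print(f"rate 1/3: {cutoff_rate_threshold_db(1.0 / 3.0):.2f} dB")
```

For rate 1/3 this evaluates to about 2.03 dB.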


Fig. 5. Bounds on $P_b$ versus Eb/No (dB) for various block lengths N: (a) the (1, 7/5, 7/5) code and (b) the (1, 5/7, 5/7) code.


In computing these bounds, we discovered and overcame two distinct types of pitfalls. First, there is
an inherent numerical precision problem that we have solved for block lengths up to about 1000. Second,
for low Eb/No, there is a sinister “false convergence” region where the bound seems to have converged
to an unchanging value, but after remaining at this constant value for many terms, it suddenly diverges
to a useless probability bound greater than 1. Figure 6 illustrates this false convergence behavior. For
this figure, the summation in the union bound expression in Eq. (9) is truncated after $i^*$ terms on the
assumption that higher-order terms in i will not contribute to the sum. For large block lengths (e.g.,
N = 400 or N = 1000 in the figure), this assumption seems to be validated because the cumulative
summation becomes almost totally flat after a small fraction $i^*/N$ of all the terms. However, when
$i^*/N$ reaches about 0.2, the cumulative summation starts increasing rapidly before saturating at a much
higher level than the first plateau. This illustration of the false convergence behavior is for Eb/No =
2.00 dB, which is just barely below the Ro threshold of 2.03 dB for rate 1/3 codes. The effect is even
more dramatic if Eb/No is decreased further. On the other hand, when Eb/No is above the Ro threshold,
false convergence is not a problem. Figure 7 shows how quickly and truly the summation converges when
Eb/No = 2.50 dB. The curves in Fig. 7 are also plotted versus the fraction $i^*/N$ in order to show the
full range of $i^*$ for all values of N simultaneously. This demonstrates that the second plateau observed
in Fig. 6 is absent at the higher value of Eb/No. However, it is apparent from Fig. 7 that only a handful
of terms (roughly i < 10) are needed for convergence in this case, and this is almost independent of the
value of N.

When we first attempted to evaluate these bounds, we were fooled by the false convergence region and 
were computing error rates low enough to contradict Shannon’s limit! After we learned how to properly 
evaluate the bounds, we found that other researchers (e.g., [2]) were unaware of the intricacies of the 
divergence and had computed the bounds inaccurately at low Eb/No. However, in the next section, we
will see that this divergence is an artifact of the bound; the error rate from an actual turbo decoder does
not diverge at the Ro threshold.

C. Comparison of the Bounds With Turbo Decoder Simulation Results 

Figure 8 compares the computed bounds for the (1, 7/5, 7/5) and (1, 5/7, 5/7) turbo codes with
simulated turbo decoder bit-error rates. We observe that, above the Ro threshold of 2.03 dB, the simulated
turbo decoder bit-error rate closely matches the error rate predicted by the bound. Below this threshold,
the turbo decoder experiences its own region of “divergence” wherein its performance deteriorates rapidly
because its iterative decoding algorithm frequently fails to converge. However, this “divergence” is far
less steep than that experienced by the bound, and it occurs well below the Ro threshold, allowing turbo
decoders to operate in the region between the limit determined by channel capacity and that determined
by Ro.

We observe in Fig. 8 the relative performance of the two exemplary turbo codes and the relative 
performance of the same codes with different block lengths. The (1, 5/7, 5/7) code is clearly superior 
to the (1, 7/5, 7/5) code in the region where the bound accurately predicts turbo decoder performance. 
This is consistent with the heuristic arguments presented in [5]. However, when Eb/No is low enough that 
the decoder’s iterative algorithm stops being effective, the two codes perform similarly, and the heuristic 
arguments do not apply. By comparing the results for block lengths N = 100 and N = 1000, we also see 
how the performance of turbo decoders at high Eb/No is dramatically improved by increasing the length 
of the random permutation. 



Fig. 8. Transfer function bound versus simulated bit-error rates (plotted against Eb/No, dB) for
the (1, 7/5, 7/5) and (1, 5/7, 5/7) codes.


D. The Turbo Decoder Error Floor 

Unfortunately, the region where turbo codes have offered astounding performance is below the
computational cutoff rate threshold, so at first glance the bounds appear to be of dubious utility. Nevertheless,
our work has some immediate applications and suggests some refinements. For Eb/No above the
computational cutoff rate threshold, we believe that the bound is not only meaningful but that it essentially
tells the whole story, i.e., the bit-error rate predicted by the bound is accurately achieved both by a
maximum-likelihood decoder and by a turbo decoder. This is demonstrated by the confluence of the
simulation and bound performance curves in Fig. 8 at high Eb/No. In this region, evaluation of the bound
requires only a few terms in the summation, and its behavior is predictable from the more heuristic
analysis about the relationships of weight distributions, permutations, and the number of codes reported
in [5]. This low-slope region of the bound establishes the elusive “error floor” that several researchers,
including ourselves, have noted but have had difficulty establishing clearly via simulations, because the
error floor for large block-length turbo codes is too low to simulate accurately.

The error floor is actually not flat, but instead is a low-slope region of the performance curve, wherein 
the turbo decoder’s error rate decreases very slowly with increasing Eb/No. The slope of the error floor 
is limited by the weakness of the turbo code’s simple constituent codes, but the position of the error 
floor can be lowered by increasing the permutation length N. Empirically, the error floor appears to be 
extrapolatable backwards through the computational cutoff rate barrier to some (as yet undetermined) 
lower Eb/No where it finally stops being an accurate predictor of turbo code performance. Furthermore, 
the extrapolated error floor in this region appears to be computable as the “false convergence” plateau 
we noted earlier. Thus, we surmise that, for some values of Eb/No below the Ro threshold, the false
convergence region of the bound actually corresponds to a true convergence region for predicting turbo 
decoder performance: even though the bound diverges, the portion of the bound based only on low-weight 
input sequences is still a useful predictor of performance. 


IV. Results for Other Codes 

Our results thus far have been developed with respect to the two exemplary turbo codes for concrete- 
ness. The theory obviously generalizes to arbitrary turbo codes constructed as parallel concatenations 
of code fragments. The enumerators t(l,i,d) must be evaluated for each code fragment, and the union
bound is obtained as a summation over products of independent enumerators. Some results for rate 
1/3 and 1/4 codes with constraint lengths 3 and 4 are shown in Fig. 9 comparing the bounds with the 
Shannon limits for these rates. 



Fig. 9. Transfer function bounds for other codes (plotted against Eb/No, dB).


V. Conclusion and Further Work


One of the most important lessons we have learned is that the divergence properties of these bounds
for turbo codes appear to be the same as those of similar bounds applied to random codes. This
observation leads us to try to adapt known bounding techniques that diverge at capacity, rather than the
computational cutoff rate, when applied to random codes. Foremost among the candidate techniques
we propose to evaluate are the Gallager bound and the bound based on the “code geometry function”
[6]. Preliminary work is encouraging, but is currently limited by extreme numerical computation barriers
when the block length is large. However, we are developing analytical approximations that are valid
asymptotically as the block length gets larger.